By Dhruv Parthasarathy
(Director of AI Programs @ Udacity)


Interactive Visualization of Why Eigenvectors Matter

You've probably heard the word eigenvector hundreds of times in class and been asked to calculate eigenvectors hundreds more. But why is this concept so central in linear algebra? Why do we study the eigenvectors of a matrix so often?
In this post, we'll see how eigenvectors help us immediately understand what a linear function will do to an input. We'll do so by playing with an interactive visualization that lets us see just that.

Quick Refresher

Embedded Image
Here, applying $A$ to its eigenvector $v$ leads to a new vector $\lambda v$ that is in the same direction as $v$. Image Source: Wikipedia.

Let's do a quick refresher to begin with. An eigenvector of a linear function $A$ is just a vector $v$ such that $Av = \lambda v$ for some constant $\lambda$, which we call the eigenvalue. At a high level, an eigenvector is a direction along which the linear function only stretches its input (for real-valued eigenvalues).
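If you want to check this definition numerically, here's a minimal numpy sketch (using the matrix that serves as the running example later in this post):

```python
import numpy as np

# The matrix used as the running example later in this post.
A = np.array([[3.0, 2.0],
              [1.0, 4.0]])

# np.linalg.eig returns the eigenvalues and a matrix whose
# columns are the corresponding (unit-length) eigenvectors.
eigenvalues, eigenvectors = np.linalg.eig(A)

for lam, v in zip(eigenvalues, eigenvectors.T):
    # The defining property: A v = lambda * v.
    assert np.allclose(A @ v, lam * v)
    print(f"lambda = {lam:.0f}, eigenvector = {v}")
```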
This is great and all, but what can we do once we find an eigenvector? What does it tell us about the underlying matrix?

Linear Functions Pull Inputs Towards the Dominant Eigenvector

Let's start with an example. Take the following linear function:
$$A = \begin{bmatrix}3 & 2 \\ 1 & 4\end{bmatrix}$$
Let's see what happens when we apply $A$ repeatedly to the input $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$.
{% coming soon!! %}
To play with this visualization, do the following:
  1. Drag the slider to increase the number of times we apply $A$.
  2. Notice how the output vector tilts towards $v_1$, an eigenvector of $A$.
So just by knowing an eigenvector of $A$ (namely the dominant eigenvector), we can get a sense of what $A$ does: $A$ pulls its input towards the axis of the dominant eigenvector.
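If the interactive isn't handy, here's a small numpy sketch of the same experiment: apply $A$ over and over to $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$ and watch the output's direction settle onto the dominant eigenvector.

```python
import numpy as np

A = np.array([[3.0, 2.0],
              [1.0, 4.0]])
x = np.array([0.0, 1.0])

# The dominant eigenvector is the one paired with the
# largest-magnitude eigenvalue.
eigenvalues, eigenvectors = np.linalg.eig(A)
dominant = eigenvectors[:, np.argmax(np.abs(eigenvalues))]

for k in range(1, 11):
    x = A @ x  # apply A one more time
    print(f"A^{k:>2} v points along {x / np.linalg.norm(x)}")

# After a few applications, the printed direction matches the
# dominant eigenvector (up to sign).
print(f"dominant eigenvector:   {dominant}")
```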
How is this happening? And why is it only towards one eigenvector?

Breaking Up the Input With an Eigenbasis

You'll be surprised to see that this behavior comes very naturally from the properties of linear functions. Let's see this with an example.
Let's keep the same matrix/linear function $A = \begin{bmatrix}3 & 2 \\ 1 & 4\end{bmatrix}$, and analyze applying $A$ three times to the input $v = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$ (i.e. $A^3 v$).

Embedded Image
In the standard approach, we'd just use matrix multiplication to find $A^3 v$.

The standard way to do this is to simply follow the rules of multiplication and carry out $A(A(A(v)))$. But instead, let's do this a different way using eigenvectors.

Embedded Image
In the following discussion we will split $v$ into a linear combination of $A$'s eigenvectors. We then apply $A^3$ to each of these pieces and combine the results.

We know that any vector $v$ can be written as a linear combination of the eigenvectors of $A$. After all, the eigenvectors are linearly independent and form a basis for the space (provided the matrix $A$ is diagonalizable, which it is). If $v_1$ and $v_2$ are the eigenvectors of $A$, we can break up $v$ as:
$$v = c_1 \cdot v_1 + c_2 \cdot v_2$$
for some constants $c_1$ and $c_2$.

Embedded Image
We first split $v$ into its eigenvector subcomponents.
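Concretely, finding $c_1$ and $c_2$ just means solving a small linear system: the matrix whose columns are $v_1$ and $v_2$ must map $(c_1, c_2)$ to $v$. Here's a sketch using the eigenvectors of our $A$, which (under one convenient scaling) work out to $v_1 = (1, 1)$ and $v_2 = (-2, 1)$:

```python
import numpy as np

# Eigenvectors of A = [[3, 2], [1, 4]], under one convenient scaling:
# v1 = (1, 1) for lambda_1 = 5, and v2 = (-2, 1) for lambda_2 = 2.
v1 = np.array([1.0, 1.0])
v2 = np.array([-2.0, 1.0])
v = np.array([0.0, 1.0])

# Solve [v1 | v2] @ (c1, c2) = v for the coefficients.
c1, c2 = np.linalg.solve(np.column_stack([v1, v2]), v)
print(c1, c2)  # 2/3 and 1/3, i.e. v = (2/3) v1 + (1/3) v2
```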

When we have this representation, we can then rethink $A^3 v$ as:
$$A^3 v = A^3(c_1 \cdot v_1 + c_2 \cdot v_2)$$
or more simply:
$$A^3 v = c_1 \cdot A^3 v_1 + c_2 \cdot A^3 v_2$$

Embedded Image
We then carry out $A^3 v_1$ and $A^3 v_2$.

We then carry out the computation of $A^3 v_1$ and $A^3 v_2$. Thanks to $v_1$ and $v_2$ being eigenvectors, each application of $A$ just scales them by their eigenvalues, so three applications scale them by the cubes of their eigenvalues:
$$A^3 v_1 = \lambda_1^3 v_1$$
$$A^3 v_2 = \lambda_2^3 v_2$$

Embedded Image
We finally combine the results to get $A^3 v$.

We then finally combine the results to get $A^3 v$. We find:
$$A^3 v = c_1 \lambda_1^3 v_1 + c_2 \lambda_2^3 v_2$$
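As a sanity check, here's a short numpy sketch confirming that the eigenbasis route gives exactly the same answer as multiplying everything out directly:

```python
import numpy as np

A = np.array([[3.0, 2.0],
              [1.0, 4.0]])
v = np.array([0.0, 1.0])

# Eigenpairs of A and the coefficients of v in the eigenbasis
# (derived above): lambda_1 = 5, lambda_2 = 2, c1 = 2/3, c2 = 1/3.
lam1, v1 = 5.0, np.array([1.0, 1.0])
lam2, v2 = 2.0, np.array([-2.0, 1.0])
c1, c2 = 2.0 / 3.0, 1.0 / 3.0

# Eigenbasis route: A^3 v = c1 * lambda_1^3 * v1 + c2 * lambda_2^3 * v2.
via_eigenbasis = c1 * lam1**3 * v1 + c2 * lam2**3 * v2

# Direct route: A(A(A(v))).
direct = A @ A @ A @ v

print(via_eigenbasis, direct)  # both print [78. 86.]
assert np.allclose(via_eigenbasis, direct)
```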

Dominant Eigenvalues and Eigenvectors

Now, what happens when $|\lambda_1|$ is larger than $|\lambda_2|$ (i.e. there exists a dominant eigenvalue)? In this example, $\lambda_1 = 5$ and $\lambda_2 = 2$. Let's now visualize what it looks like to carry out $A^3 v$ when we have this difference in eigenvalues.
The interaction below shows this setup:


{% include eigenvectors.html %}


  1. Drag the slider to increase or decrease the number of times we apply $A$ to $v$.
  2. Notice how "Output Eigenvector 1" and "Output Eigenvector 2" change at different rates.
  3. Notice how "Final Output Vector" tilts towards "Output Eigenvector 1" as you drag the slider to the right.
We thus see that when one eigenvalue is larger in magnitude than the other ($|\lambda_1| > |\lambda_2|$), the linear function pushes its inputs towards the eigenvector associated with that larger eigenvalue ("Output Eigenvector 1"). The more times we apply $A$, the stronger this effect.
Note this "push" effect will only happen towards this eigenvector with the largest eignevalue - not any of the other eigenvectors.

Why This Happens

This tilt towards "Output Eigenvector 1" happens due to exponential growth: $\lambda_1^x$ grows much faster than $\lambda_2^x$. As such, the more times we apply $A$ ($x$ in our exponentials), the bigger the difference between $\lambda_1^x$ and $\lambda_2^x$, and hence the more weight the $v_1$ term carries in the final sum. This increasing difference is shown in the plot below.

Embedded Image
Due to the power of exponentials, the dominant eigenvector plays a bigger and bigger role the more times we apply $A$. Notice how the distance between the two exponential functions increases with $x$.
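If you'd rather see the raw numbers than the plot, this tiny sketch prints the two weights side by side; the ratio $(\lambda_1 / \lambda_2)^x = 2.5^x$ blows up quickly:

```python
# Weights on the two eigenvector terms after x applications of A.
lam1, lam2 = 5, 2

for x in range(1, 9):
    print(f"x = {x}: lambda_1^x = {lam1**x:>6}, "
          f"lambda_2^x = {lam2**x:>3}, ratio = {lam1**x / lam2**x:.1f}")
```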

Conclusion

So, just using the properties of linear functions, we are able to see why eigenvectors are so important. They show us where a linear function will "push" its inputs.
If you've enjoyed this post on eigenvectors, check out the following additional posts I've written on the topic:
  1. You could have come up with eigenvectors. Here's how.
  2. How Eigenvectors Power PageRank - the algorithm behind Google Search.
Thanks for reading!

Caveat

  1. Everything I've discussed is for real eigenvalues.
  2. This only applies to matrices that are diagonalizable.

Credits

Thanks to Luis Serrano, Rouzbeh Shirvani, and Pranav Ramkrishnan for feedback.